Search results for "Sparse matrix"
showing 10 items of 23 documents
Low-complexity AoA and AoD Estimation in the Transformed Spatial Domain for Millimeter Wave MIMO Channels
2021
High-accuracy angle of arrival (AoA) and angle of departure (AoD) estimation is critical for cell search, stable communications and positioning in millimeter wave (mmWave) cellular systems. Moreover, the design of low-complexity AoA/AoD estimation algorithms is also of major importance in the deployment of practical systems to enable a fast and resource-efficient computation of beamforming weights. Parametric mmWave channel estimation allows to describe the channel matrix as a combination of direction-dependent signal paths, exploiting the sparse characteristics of mmWave channels. In this context, a fast Transformed Spatial Domain Channel Estimation (TSDCE) algorithm was recently proposed …
Iterative sparse matrix-vector multiplication for accelerating the block Wiedemann algorithm over GF(2) on multi-graphics processing unit systems
2012
SUMMARY The block Wiedemann (BW) algorithm is frequently used to solve sparse linear systems over GF(2). Iterative sparse matrix–vector multiplication is the most time-consuming operation. The necessity to accelerate this step is motivated by the application of BW to very large matrices used in the linear algebra step of the number field sieve (NFS) for integer factorization. In this paper, we derive an efficient CUDA implementation of this operation by using a newly designed hybrid sparse matrix format. This leads to speedups between 4 and 8 on a single graphics processing unit (GPU) for a number of tested NFS matrices compared with an optimized multicore implementation. We further present…
CRiSPy-CUDA: Computing Species Richness in 16S rRNA Pyrosequencing Datasets with CUDA
2011
Pyrosequencing technologies are frequently used for sequencing the 16S rRNA marker gene for metagenomic studies of microbial communities. Computing a pairwise genetic distance matrix from the produced reads is an important but highly time consuming task. In this paper, we present a parallelized tool (called CRiSPy) for scalable pairwise genetic distance matrix computation and clustering that is based on the processing pipeline of the popular ESPRIT software package. To achieve high computational efficiency, we have designed massively parallel CUDA algorithms for pairwise k-mer distance and pairwise genetic distance computation. We have also implemented a memory-efficient sparse matrix clust…
Computing the Kekulé structure count for alternant hydrocarbons
2002
A fast computer algorithm brings computation of the permanents of sparse matrices, specifically, molecular adjacency matrices. Examples and results are presented, along with a discussion of the relationship of the permanent to the Kekule structure count. A simple method is presented for determining the Kekule structure count of alternant hydrocarbons. For these hydrocarbons, the square of the Kekule structure count is equal to the permanent of the adjacency matrix. In addition, for alternant structures the adjacency matrix for N atoms can be written in such a way that only an N/2 × N/2 matrix need be evaluated. The Kekule structure count correlates with topological indices. The inclusion of…
The Sliced COO Format for Sparse Matrix-Vector Multiplication on CUDA-enabled GPUs
2012
Abstract Existing formats for Sparse Matrix-Vector Multiplication (SpMV) on the GPU are outperforming their corresponding implementations on multi-core CPUs. In this paper, we present a new format called Sliced COO (SCOO) and an effcient CUDA implementation to perform SpMV on the GPU. While previous work shows experiments on small to medium-sized sparse matrices, we perform evaluations on large sparse matrices. We compared SCOO performance to existing formats of the NVIDIA Cusp library. Our resutls on a Fermi GPU show that SCOO outperforms the COO and CSR format for all tested matrices and the HYB format for all tested unstructured matrices. Furthermore, comparison to a Sandy-Bridge CPU sho…
An Advanced Numerical Model in Solving Thin-Wire Integral Equations by Using Semi-Orthogonal Compactly Supported Spline Wavelets
2003
Abstract—In this paper, the semi-orthogonal compactly sup- ported spline wavelets are used as basis functions for the efficient solution of the thin-wire electric field integral equation (EFIE) in frequency domain. The method of moments (MoM) is used via the Galerkin procedure. Conventional MoM directly applied to the EFIE, leads to dense matrix which often becomes computation- ally intractable when large-scale problems are approached. To overcome these difficulties, wavelets can be used as a basis set so obtaining the generation of a sparse matrix; this is due to the local supports and the vanishing moments properties of the wavelets. In the paper, this technique is applied to analyze elec…
Assessment of computational methods for the analysis of single-cell ATAC-seq data
2019
Abstract Background Recent innovations in single-cell Assay for Transposase Accessible Chromatin using sequencing (scATAC-seq) enable profiling of the epigenetic landscape of thousands of individual cells. scATAC-seq data analysis presents unique methodological challenges. scATAC-seq experiments sample DNA, which, due to low copy numbers (diploid in humans), lead to inherent data sparsity (1–10% of peaks detected per cell) compared to transcriptomic (scRNA-seq) data (10–45% of expressed genes detected per cell). Such challenges in data generation emphasize the need for informative features to assess cell heterogeneity at the chromatin level. Results We present a benchmarking framework that …
Hardware-efficient matrix inversion algorithm for complex adaptive systems
2012
This work shows an FPGA implementation for the matrix inversion algebra operation. Usually, large matrix dimension is required for real-time signal processing applications, especially in case of complex adaptive systems. A hardware efficient matrix inversion procedure is described using QR decomposition of the original matrix and modified Gram-Schmidt method. This works attempts a direct VHDL description using few predefined packages and fixed point arithmetic for better optimization. New proposals for intermediate calculations are described, leading to efficient logic occupation together with better performance and accuracy in the vector space algebra. Results show that, for a relatively s…
A branch and bound algorithm for the matrix bandwidth minimization
2008
In this article, we first review previous exact approaches as well as theoretical contributions for the problem of reducing the bandwidth of a matrix. This problem consists of finding a permutation of the rows and columns of a given matrix which keeps the non-zero elements in a band that is as close as possible to the main diagonal. This NP-complete problem can also be formulated as a labeling of vertices on a graph, where edges are the non-zero elements of the corresponding symmetrical matrix. We propose a new branch and bound algorithm and new expressions for known lower bounds for this problem. Empirical results with a collection of previously reported instances indicate that the propose…
LightSpMV: Faster CSR-based sparse matrix-vector multiplication on CUDA-enabled GPUs
2015
Compressed sparse row (CSR) is a frequently used format for sparse matrix storage. However, the state-of-the-art CSR-based sparse matrix-vector multiplication (SpMV) implementations on CUDA-enabled GPUs do not exhibit very high efficiency. This has motivated the development of some alternative storage formats for GPU computing. Unfortunately, these alternatives are incompatible with most CPU-centric programs and require dynamic conversion from CSR at runtime, thus incurring significant computational and storage overheads. We present LightSpMV, a novel CUDA-compatible SpMV algorithm using the standard CSR format, which achieves high speed by benefiting from the fine-grained dynamic distribut…